zsh-bench
Benchmark for interactive zsh.
Summary
- zsh-bench measures user-visible latency of interactive zsh: input lag, command lag, etc. You can use it to benchmark your own shell.
- human-bench measures human perception of latency when using interactive zsh. You can use it to check how it feels to use zsh with specific latencies or to test whether you can tell a difference between 5ms and 0ms lag.
- I've used human-bench to conduct a blind study on myself to find the maximum values of latencies that are indistinguishable from zero. For example, command lag below 10ms feels just like 0ms but anything above this value starts feeling sluggish.
- I've used these threshold values to normalize benchmark results to see what is fast and what is slow.
- Armed with this set of tools I've optimized two of my zsh projects: powerlevel10k and zsh4humans. They used to be fairly fast but now they are literally indistinguishable from instantaneous as far as human perception goes.
- I've benchmarked many zsh techniques, plugins, frameworks and plugin managers and have shared my observations in this document together with a brief conclusion.
Install
Clone the repo:
git clone https://github.com/romkatv/zsh-bench ~/zsh-bench
Usage
usage: zsh-bench [OPTION].. [CONFIG]..
OPTIONS
-h,--help
-i,--iters <NUM> [default=16]
-l,--login <yes|no> [default=yes]
-g,--git <yes|no|empty> [default=yes]
-c,--config-dir <directory> [default=<zsh-bench-dir>/configs]
-d,--scratch-dir <directory>
-I,--isolation <docker|user>
-s,--standalone
-r,--raw
Benchmark zsh on your machine
~/zsh-bench/zsh-bench
This requires zsh >= 5.8 and it must be your login shell.
If your zsh startup files start tmux, the benchmark may hang unless your tmux has this fix
for this bug.
If your zsh startup files enable history but don't set histignorespace, you might find random
commands in your history after running zsh-bench.
Benchmark predefined zsh configs
~/zsh-bench/zsh-bench --isolation docker -- <name> [name]..
This requires docker. Names of predefined zsh configs are directories under
configs.
With --isolation user benchmarking is done as user zsh-bench on the host.
How it works
zsh-bench creates a virtual TTY and starts a login shell in it. It then sends keystrokes to the
TTY and measures how long it takes for the shell to react. For example, zsh-bench can send
echo hello and echo goodbye to the TTY twice in quick succession and measure how long it takes
for the words "hello" and "goodbye" to be printed.
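The measurement idea can be tried in a much cruder form without a virtual TTY. The sketch below is not how zsh-bench is implemented: it writes `echo hello` to a shell over a plain pipe and times how long the reply takes, so the result includes the startup time of the shell. The function name `react_time_ms` is made up for illustration, and the code assumes GNU `date` with nanosecond support (`%N`).

```shell
# Assumption: GNU date, whose %N prints nanoseconds.
# Print how many milliseconds a freshly started shell needs to answer "echo hello".
react_time_ms() {
  shell=${1:-sh}
  start=$(date +%s%N)
  # Send the command over a pipe and wait for the shell to print the reply.
  reply=$(printf 'echo hello\n' | "$shell")
  end=$(date +%s%N)
  # Fail if the shell did not print the expected word.
  [ "$reply" = hello ] || return 1
  echo $(( (end - start) / 1000000 ))
}

react_time_ms sh
```

A pipe hides everything that makes interactive latency interesting (prompt rendering, line editor widgets, syntax highlighting), which is why the real benchmark drives a TTY instead.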
What it measures
Sample output of zsh-bench:
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=14.331
first_command_lag_ms=56.500
command_lag_ms=2.518
input_lag_ms=5.195
exit_time_ms=5.886
The first few fields list detected shell capabilities; the rest are measured latencies.
Shell capabilities (0 or 1):
| name | meaning |
|---|---|
| creates tty | the shell creates its own TTY by invoking tmux or screen |
| has compsys | the shell initializes compsys—the "new" completion system—by invoking compinit |
| has syntax highlighting | user input (the command line) is highlighted by zsh-syntax-highlighting |
| has autosuggestions | suggestions for command completions are offered automatically by zsh-autosuggestions |
| has git prompt | git branch is displayed in prompt |
Latencies (in milliseconds):
| name | what time it measures | if too high |
|---|---|---|
| first prompt lag (ms) | from the start of the shell to the moment prompt appears on the screen | you get to stare at an empty screen for some time whenever you open a terminal |
| first command lag (ms) | from the start of the shell to the moment the first interactive command starts executing | you get to wait for the output of ls if you type it really fast after opening a terminal |
| command lag (ms) | from pressing Enter on an empty command line to the moment the next prompt appears; the same as zsh-prompt-benchmark (my project) | all commands appear to take longer to execute; the slowdown may happen after you press Enter and before the command starts executing, or after the command finishes executing and before the next prompt appears |
| input lag (ms) | from pressing a regular key to the moment the corresponding character appears on the command line; this test is performed when the current command line is already fairly long | keyboard input feels sluggish, as if you are working over an SSH connection with high latency |
| exit time (ms) | how long it takes to execute zsh -lic "exit"; this value is meaningless as far as measuring interactive shell latencies goes | there is no baseline value for this latency, so it cannot be "too high" |
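Since the output is plain `name=value` lines, it is easy to post-process. The following is a minimal sketch, assuming the output format shown in the sample above; the helper names `latencies` and `capabilities` are invented for this example and are not part of zsh-bench.

```shell
# Sample zsh-bench output, copied from the section above.
sample='creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=14.331
first_command_lag_ms=56.500
command_lag_ms=2.518
input_lag_ms=5.195
exit_time_ms=5.886'

# Print only the measured latencies (fields ending in _ms) as "name value".
latencies() {
  awk -F= '/_ms=/ { print $1, $2 }'
}

# Print the names of the capabilities that were detected (value 1).
capabilities() {
  awk -F= '$1 !~ /_ms$/ && $2 == 1 { print $1 }'
}

printf '%s\n' "$sample" | latencies
```

With real data the sample would be replaced by a pipe, for example `~/zsh-bench/zsh-bench | latencies`.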
How fast is fast
Is input lag of 5ms a lot? What about first prompt lag of 100ms? How good/bad would it
be if these latencies were halved/doubled? I've written human-bench to answer questions like this.
It's a small tool that can simulate zsh with latencies of your choice.
usage: human-bench [OPTION]..
OPTIONS
-h,--help
-s,--shell-command <STR> [default="zsh"]
-f,--first-prompt-lag-ms <NUM> [default=0]
-c,--first-command-lag-ms <NUM> [default=0]
-p,--command-lag-ms <NUM> [default=0]
-i,--input-lag-ms <NUM> [default=0]
It turns out that first prompt lag of 100ms causes the first prompt to appear with a noticeable
delay when starting zsh but input lag of 5ms is barely perceptible. Or is it imperceptible?
When I invoke human-bench --input-lag-ms 5 I expect input to lag and this might affect what I'm
seeing. To rule out this bias I've extended human-bench to accept several values of the same
latency:
human-bench --input-lag-ms 0 --input-lag-ms 5
human-bench picks one of these latencies at random before starting zsh. However, it doesn't
reveal the choice until I exit the playground. With this tool in hand I conducted a blind study on
myself and found out that I cannot distinguish between these two latencies. As far as my senses are
concerned, input lag of 5ms is as good as zero.
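The blinding step boils down to picking a value at random and keeping it secret until the session is over. Here is a minimal sketch of that idea. It is not taken from human-bench; `pick_one` is a hypothetical helper and the playground session itself is left as a comment.

```shell
# Print one of the arguments, chosen at random.
pick_one() {
  # awk's srand() seeds from the clock, which is good enough for a sketch.
  index=$(awk -v n="$#" 'BEGIN { srand(); print int(rand() * n) + 1 }')
  # Drop the arguments before the chosen one and print it.
  shift $((index - 1))
  printf '%s\n' "$1"
}

secret=$(pick_one 0 5)
# ... the playground session would run here with an input lag of $secret ms,
# without telling the user which value was picked ...
echo "the lag was ${secret}ms"   # revealed only after the session ends
```

Repeating such a trial many times and writing down a guess before each reveal gives the accuracy figure that decides whether two latencies can be told apart.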
I used this blinding method to find the threshold values of all latencies in my use of zsh. Any value below the threshold is—to me—indistinguishable from zero. I can distinguish values above the threshold from zero with better than 50% accuracy.
| latency (ms) | the maximum value indistinguishable from zero |
|---|---|
| first prompt lag | 50 |
| first command lag | 150 |
| command lag | 10 |
| input lag | 20 |
The first two latencies are related to zsh startup time. I don't start zsh by literally typing zsh
within an existing shell. I either open a terminal, create a new tab, or split an existing tab. The
last is the most common, so I rigged human-bench to split a tab for the purpose of testing
startup latencies. For the other two latencies I typed and executed simple commands.
Keep in mind that these thresholds may have different values for different people, machines, terminals, etc. I believe the ballpark should be the same though.
Benchmark results
I implemented zsh-bench in order to optimize powerlevel10k
and zsh4humans. In the process I benchmarked many zsh
techniques, plugins, frameworks and plugin managers. I'm sharing some of my findings here.
I recommend reading this section top-to-bottom without jumping back and forth. You can also skip right to conclusions.
All benchmark results in this section have been normalized by the
threshold values. For example, first prompt lag of 25ms becomes 50% and 100ms becomes
200%. Latencies up to 50%, 100% and 200% are marked with 🟢, 🟡 and 🟠 respectively; anything above 200% is marked with 🔴.
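The normalization is simple arithmetic: divide the measured latency by its threshold and express the result as a percentage. The sketch below reproduces the two examples from the text. The function name `normalize` is made up, the marker bands are read off the tables in this section, and rounding to the nearest whole percent is an assumption about how the published numbers were produced.

```shell
# Print a latency as a percentage of its threshold, followed by a marker.
# Arguments: latency in ms, threshold in ms.
normalize() {
  awk -v lag="$1" -v max="$2" 'BEGIN {
    pct = 100 * lag / max
    if      (pct <= 50)  mark = "🟢"
    else if (pct <= 100) mark = "🟡"
    else if (pct <= 200) mark = "🟠"
    else                 mark = "🔴"
    # Assumption: percentages are rounded to whole numbers for display.
    printf "%.0f%% %s\n", pct, mark
  }'
}

normalize 25 50    # first prompt lag of 25ms  → 50% 🟢
normalize 100 50   # first prompt lag of 100ms → 200% 🟠
```

The thresholds passed as the second argument come from the table above: 50 for first prompt lag, 150 for first command lag, 10 for command lag and 20 for input lag.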
Basics
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| no-rcs | ❌ | ❌ | ❌ | ❌ | ❌ | 3% 🟢 | 1% 🟢 | 1% 🟢 | 1% 🟢 |
| tmux | ✔️ | ❌ | ❌ | ❌ | ❌ | 8% 🟢 | 3% 🟢 | 1% 🟢 | 1% 🟢 |
| compsys | ❌ | ✔️ | ❌ | ❌ | ❌ | 37% 🟢 | 12% 🟢 | 1% 🟢 | 1% 🟢 |
| zsh-syntax-highlighting | ❌ | ❌ | ✔️ | ❌ | ❌ | 23% 🟢 | 14% 🟢 | 6% 🟢 | 57% 🟡 |
| zsh-autosuggestions | ❌ | ❌ | ❌ | ✔️ | ❌ | 32% 🟢 | 13% 🟢 | 96% 🟡 | 3% 🟢 |
| git-branch | ❌ | ❌ | ❌ | ❌ | ✔️ | 32% 🟢 | 11% 🟢 | 50% 🟢 | 1% 🟢 |
no-rcs is zsh in its pure form, without any rc files. It's really fast! Even if it was 10 times slower, I wouldn't be able to tell a difference.
The rest of the entries here are the simplest configs capable of providing each capability. For
example, here's .zshrc from zsh-autosuggestions:
source ~/zsh-autosuggestions/zsh-autosuggestions.zsh
Just one line. Plain and simple. ~/zsh-autosuggestions is supposed to be created manually, outside
of zsh rc files. In the benchmark it's done by a setup script like this:
git clone -q --depth=1 https://github.com/zsh-users/zsh-autosuggestions.git ~/zsh-autosuggestions
These basic building blocks are composable. We can easily create a config by combining any number of them. Later we'll be doing just that. The latencies of a combination are the sums of latencies of all its constituents. For example, first prompt lag of tmux+compsys is first prompt lag of tmux plus first prompt lag of compsys.
You can probably already see that adding everything together will push some latencies over the threshold. Our goal is to avoid that while still getting all the goodies.
git-branch gives us git prompt for the price of 48% of our command lag budget. Let's see if we can do better.
Prompt
I've benchmarked several different git prompts.
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| git-branch | ❌ | ❌ | ❌ | ❌ | ✔️ | 32% 🟢 | 11% 🟢 | 50% 🟢 | 1% 🟢 |
| agnoster | ❌ | ❌ | ❌ | ❌ | ✔️ | 65% 🟡 | 22% 🟢 | 244% 🔴 | 1% 🟢 |
| starship | ❌ | ❌ | ❌ | ❌ | ✔️ | 82% 🟡 | 28% 🟢 | 354% 🔴 | 1% 🟢 |
| powerlevel10k | ❌ | ❌ | ❌ | ❌ | ✔️ | 4% 🟢 | 14% 🟢 | 19% 🟢 | 1% 🟢 |
The git repo used by the benchmark has 1,000 directories and 10,000 files in it. Not too few,
not too many. All benchmarks ran with untracked cache enabled. Wall time of git status stood at
16ms.
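A repository of comparable size can be generated with a short script. This is a sketch under stated assumptions, not the setup script used by the benchmark: the flat layout of 1,000 directories with 10 files each, the helper name `make_tree` and the path `~/bench-repo` are all made up for illustration. The git commands are left as comments so that the block has no side effects when pasted.

```shell
# Create a tree with the given number of directories and files per directory.
# Arguments: root directory, number of directories, files per directory.
make_tree() {
  root=$1 dirs=$2 files_per_dir=$3
  d=1
  while [ "$d" -le "$dirs" ]; do
    mkdir -p "$root/dir$d"
    f=1
    while [ "$f" -le "$files_per_dir" ]; do
      printf '%s\n' "$d/$f" > "$root/dir$d/file$f"
      f=$((f + 1))
    done
    d=$((d + 1))
  done
}

# 1,000 directories with 10 files each gives 10,000 files:
#   make_tree ~/bench-repo 1000 10
#   cd ~/bench-repo && git init -q && git add -A && git commit -q -m init
#   git config core.untrackedCache true   # enable the untracked cache
#   time git status                       # wall time of git status
```

Timing `git status` in such a repo gives a rough lower bound on the command lag of any prompt that runs it synchronously on every command.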
git-branch only shows the name of the current branch and has the same latency regardless of the repository size.
agnoster config uses the classic agnoster zsh theme. It scans the whole repo to see if there are untracked files, unstaged changes, etc. We can see that this causes lag on every command pushing latency into the red. The lag is linear in the number of files and directories in the git repo. You wouldn't want to use this theme in a truly large git repo with hundreds of thousands or millions of files.
starship config uses the cross-shell starship prompt. It suffers from the same performance problems as agnoster plus some. Starship is implemented as an external binary, so it has to pay the price of at least one additional fork+exec on every command compared to native zsh prompts. One fork+exec cannot account for the high lag starship exhibits, so what gives? Under the benchmark conditions starship clones 158 times! That's costly.
powerlevel10k config uses powerlevel10k zsh theme
that I've developed. It scans the git repo just like agnoster and starship but it does not invoke
git to do that. Instead, it uses gitstatus -- another of
my projects. This gives powerlevel10k a nice speedup on repositories large and small. In addition,
powerlevel10k doesn't block zsh prompt while gitstatus is scanning the repo, so command lag stays
constant even in giant repositories. Powerlevel10k has a few other interesting performance-related
properties that we'll explore when we start building real zsh configs.
Premade configs
Let's see what some of the popular premade zsh configs offer out of the box.
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| prezto | ❌ | ✔️ | ❌ | ❌ | ❌ | 97% 🟡 | 35% 🟢 | 13% 🟢 | 1% 🟢 |
| ohmyzsh | ❌ | ✔️ | ❌ | ❌ | ✔️ | 187% 🟠 | 64% 🟡 | 366% 🔴 | 2% 🟢 |
| zim | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | 122% 🟠 | 53% 🟡 | 191% 🟠 | 64% 🟡 |
| zsh4humans | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | 19% 🟢 | 36% 🟢 | 27% 🟢 | 25% 🟢 |
The names of these configs match the respective public projects from which they were copied: ohmyzsh, prezto, zim and zsh4humans. The last is my project. All configs were used unmodified.
prezto is fast but doesn't provide much out of the box. No syntax highlighting, autosuggestions or git prompt. Users who need these features—most do—should enable them explicitly.
ohmyzsh and zim by default use a theme with performance characteristics similar to those of agnoster, so they have high command lag in larger git repositories. zim is the only config among these that doesn't detect untracked files. This appears to have been a conscious decision by zim developers for performance reasons. We can see that it helps: zim has approximately half the command lag of ohmyzsh.
zim enables syntax highlight and autosuggestions by default, so it naturally has higher input lag than projects that don't. Fast zsh startup is a major explicit goal of zim and we can see that it beats ohmyzsh on first prompt lag and first command lag.
zsh4humans ticks all capability checkboxes and has all latencies comfortably in the green zone.
This shouldn't be surprising. In the game of optimization, measuring is half the work. I had access
to zsh-bench, so I was able to optimize zsh4humans to score well on it. Prior to creating
zsh-bench I knew that input lag and first command lag in zsh4humans were sometimes noticeable
but it was difficult to evaluate the effectiveness of potential optimizations when I not only
couldn't measure these latencies but didn't even have clear concepts for them.
Note that all these projects have extra features that aren't reflected in the capabilities shown in the table. They all enable persistent history, define aliases, set shell options, etc.
Do it yourself
Let's leave premade configs alone for some time and try to build a zsh config from scratch. Given the availability of high-quality building blocks, this shouldn't be very difficult.
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| diy | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | 118% 🟠 | 47% 🟢 | 155% 🟠 | 61% 🟡 |
| diy+ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | 10% 🟢 | 51% 🟡 | 24% 🟢 | 63% 🟡 |
| diy++ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | 10% 🟢 | 42% 🟢 | 24% 🟢 | 64% 🟡 |
diy is the simplest config that provides all capabilities. I've made it by concatenating configs
of the basic building blocks. Here's the whole .zshrc:
# If not in tmux, start tmux.
if [[ -z ${TMUX+X}${ZSH_SCRIPT+X}${ZSH_EXECUTION_STRING+X} ]]; then
  exec tmux
fi
# Enable the "new" completion system (compsys).
autoload -Uz compinit && compinit
# Configure prompt to show the current working directory and git branch.
autoload -Uz vcs_info add-zsh-hook
add-zsh-hook precmd vcs_info
PS1='%~ $vcs_info_msg_0_'
setopt prompt_subst
# Enable syntax highlighting.
source ~/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh
# Enable autosuggestions.
source ~/zsh-autosuggestions/zsh-autosuggestions.zsh
Two latencies are over the threshold, so some lag can be noticeable. However, before using this config you'll want to add more stuff to it -- key bindings, environment variables, aliases, completions options, etc. All these things will make zsh slower. If you aren't careful, a lot slower.
diy+ improves on diy by replacing its prompt with powerlevel10k. This has a
dramatic positive effect on first prompt lag and command lag. Moreover, first prompt lag
is now constant and won't increase if more stuff is added to the config. This means you'll never
have to stare at an empty screen when opening a terminal -- the prompt will be there right from the start.
The additional initialization code will only affect first command lag, which has the highest
threshold of all latencies and has the least impact on the perception of zsh performance. I've also
made .zshrc in this config self-bootstrapping to obviate the need to maintain a separate install
or setup script for the cloning of zsh plugin repositories. This is primarily to make comparisons
with plugin managers in the future sections easier.
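Self-bootstrapping can be as simple as cloning a plugin when its directory is missing. The snippet below is a minimal sketch of that idea and not the actual diy+ config; `ensure_plugin` is a hypothetical helper name, and the example calls are commented out so that pasting the block doesn't touch the network.

```shell
# Hypothetical helper for the top of .zshrc:
# clone a plugin repository only if its directory does not exist yet.
ensure_plugin() {
  local url="$1" dir="$2"
  [ -d "$dir" ] || git clone -q --depth=1 "$url" "$dir"
}

# Example use in .zshrc:
#   ensure_plugin https://github.com/zsh-users/zsh-autosuggestions.git ~/zsh-autosuggestions
#   source ~/zsh-autosuggestions/zsh-autosuggestions.zsh
```

On every startup after the first, the cost is a single directory check per plugin.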
diy++ adds one more optimization -- it compiles large zsh files to wordcode. This reduces first command lag a little bit. This config performs well and is still relatively simple.
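Compiling to wordcode is done with the zsh builtin `zcompile`. The following is a sketch of how a .zshrc might compile plugin files on demand. It is not copied from the diy++ config: the helper name `compile_if_stale` is invented, and treating the two plugin entry points as the "large zsh files" is an assumption.

```shell
# Meant for .zshrc: zcompile is a zsh builtin.
# Compile a file to wordcode unless an up-to-date .zwc file already exists.
compile_if_stale() {
  local src="$1"
  if [ -r "$src" ] && { [ ! -e "$src.zwc" ] || [ "$src" -nt "$src.zwc" ]; }; then
    zcompile -R -- "$src.zwc" "$src"
  fi
}

# Assumption: these are the files worth compiling; nothing happens if they are absent.
compile_if_stale ~/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh
compile_if_stale ~/zsh-autosuggestions/zsh-autosuggestions.zsh
```

When zsh later sources one of these files, it picks up the adjacent .zwc file as long as it is newer than the source, which skips the parsing step.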
Cutting corners
There are several optimizations that speed up zsh startup but can easily backfire. diy++unsafe adds three such optimizations on top of diy++ to reduce first command lag by 5% of the budget. I don't recommend them.
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| diy++unsafe | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | 9% 🟢 | 37% 🟢 | 24% 🟢 | 63% 🟡 |
The first optimization is to compile .zshrc itself to wordcode. This will cause you a lot of grief
if you do something like this:
% cp ~/.zshrc ~/.zshrc.bak # backup .zshrc before messing with it
% vi ~/.zshrc # change stuff
% exec zsh # restart zsh to try the new changes
% mv ~/.zshrc.bak ~/.zshrc # revert changes
% exec zsh # restart zsh
At this point you'll be surprised to find that the last command seemingly had no effect. Zsh is
still using .zshrc that you modified in vi. This happens because mv preserves file
modification time and zsh looks at it to figure out whether the wordcode matches the source code.
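The pitfall can be reproduced without zsh by replaying the timestamps from the scenario. This is a sketch: the file names are stand-ins inside a temporary directory, the modification times are set by hand with `touch -t`, and the `-nt` test only imitates the freshness check described above.

```shell
work=$(mktemp -d)

echo 'echo original' > "$work/zshrc.bak"
touch -t 202001010000 "$work/zshrc.bak"   # backup made at 00:00
echo 'echo modified' > "$work/zshrc"
touch -t 202001010100 "$work/zshrc"       # edited at 01:00
: > "$work/zshrc.zwc"
touch -t 202001010200 "$work/zshrc.zwc"   # compiled at 02:00 on restart

mv "$work/zshrc.bak" "$work/zshrc"        # revert: mtime stays at 00:00

# The same kind of check: is the source newer than the wordcode?
if [ "$work/zshrc" -nt "$work/zshrc.zwc" ]; then
  verdict='source is newer: recompile'
else
  verdict='wordcode looks current: stale wordcode is used'
fi
echo "$verdict"   # prints: wordcode looks current: stale wordcode is used
rm -rf "$work"
```

Running `touch ~/.zshrc` after the `mv` would bump the modification time and make the check come out the other way.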
Compiling .zshrc to wordcode will also prevent you from using aliases in .zshrc if they were
defined in the same file. For example:
alias ll='ls -l'
lll() { ll "$@" | less; }
If you put this code in .zshrc, you'll notice that lll doesn't work if .zshrc is compiled.
Another dangerous optimization is to invoke compinit with -C. If you do that and install a new
tool using your favorite package manager, completions for this tool may not appear in zsh even after
restart. You'll have to manually delete a cache file. Saving a few milliseconds on zsh startup is
not worth it if later you'll have to spend an hour trying to figure out why completions don't work.
The last over-eager optimization is to print the first prompt before checking whether plugins need to be installed. The section on Instant Prompt explains why this is a bad idea.
Powerlevel10k
Before going further let's look at powerlevel10k more closely. This theme can display a lot of
information in prompt: disk usage, public IP address, VPN status, current kubernetes context,
taskwarrior task count, etc. The configuration of powerlevel10k affects its latency. All benchmarked
configs that use powerlevel10k employ the same small config that only shows the current working
directory and git status in prompt. In addition to this I've also measured (and optimized -- this
was the whole point of working on zsh-bench) the performance of powerlevel10k with everything
turned on.
| config | tmux | compsys | syntax highlight | auto suggest | git prompt | first prompt lag | first cmd lag | cmd lag | input lag |
|---|---|---|---|---|---|---|---|---|---|
| powerlevel10k | ❌ | ❌ | ❌ | ❌ | ✔️ | 4% 🟢 | 14% 🟢 | 19% 🟢 | 1% 🟢 |
| powerlevel10k-full | ❌ | ❌ | ❌ | ❌ | ✔️ | 8% 🟢 | 27% 🟢 | 64% 🟡 | 6% 🟢 |
powerlevel10k-full has substantially higher command lag but it's still under 100%, meaning that prompt is still indistinguishable from instantaneous. However, there is not much command lag budget left for doing extra things on every command. So you would need to be careful with your zsh config if you were to use powerlevel10k with everything turned on. In practice, no sane person would enable everything. Here's a ridiculously overwrought prompt:
It has 18 segments. The full config enables 64!
powerlevel10k-full increases input lag by 1.2ms compared to the smaller config. Powerlevel10k can dynamically update prompt depending on the current command you are typing. Here's an example from powerlevel10k docs where the current kubernetes context and gcloud credentials are shown only when they are relevant to the current command.
This feature requires parsing the command line as it changes, hence extra input lag. The impact on latency is small, so it shouldn't cause any problems.
Instant prompt
I mentioned earlier that powerlevel10k makes first prompt lag small and, importantly, independent from anything else you have in zsh startup files. This feature is called Instant Prompt in powerlevel10k docs (a more appropriate name would have been Instant First Prompt) and it's worth looking at how it works.
When you open the homepage of Google in a web browser, it appears to load almost instantly even though there is a lot of fancy functionality built into it. If we look under the hood, the whole page takes a long time to load but most of this loading happens after the UI has been rendered. It doesn't take much time to render an input box for the query and the search button, so it looks instantaneous. The initial UI may look like the real thing but it's only a stub. If you enter a query and click the button quickly enough, search results won't appear. The button doesn't have the necessary logic yet, so it'll just remember that it was clicked. The query will go through once the page loads.
This trick works really well because you can start typing right away. Typing is very slow by machine standards, so by the time you are finished the page has almost always been fully loaded and you don't notice any delay when clicking the button.
Powerlevel10k uses the same trick, only in case of zsh the UI you see on startup is the prompt. As soon as you open a terminal, powerlevel10k prints prompt. This first prompt only has information that can be computed quickly (the current working directory, username, hostname, current time, python virtual environment, etc.) but nothing that can require a lot of time, so no git status. While you are typing the first command, zsh continues to initialize -- loading plugins, setting up completions, defining aliases, enabling key bindings, retrieving git status, etc. Once zsh is fully initialized, the original limited prompt is replaced with the full prompt and whatever you have typed is replayed in Zsh Line Editor (zle). If you've enabled syntax highlighting, at this point the command line gets highlighted. Here's how it looks: